
    Detecting and Characterizing Political Incivility on Social Media

    Researchers of political communication study the impact and perceptions of political incivility on social media. Yet, so far, relatively few works have attempted to automatically detect and characterize political incivility. In our work, we study political incivility on Twitter, presenting several research contributions. First, we present state-of-the-art incivility detection results using a large dataset, which we collected and labeled via crowdsourcing. Importantly, we distinguish between uncivil political speech that is impolite and intolerant anti-democratic discourse. Applying political incivility detection at large scale, we derive insights regarding the prevalence of this phenomenon across users, and explore the network characteristics of users who are susceptible to disseminating uncivil political content online. Finally, we propose an approach for modeling social context information about the tweet author alongside the tweet content, showing that this leads to significantly improved performance on the task of political incivility detection. This result holds promise for related tasks, such as hate speech and stance detection.

    Relational social recommendation: Application to the academic domain

    This paper outlines RSR, a relational social recommendation approach applied to a social graph composed of relational entity profiles. RSR uses information extraction and learning methods to obtain relational facts about persons of interest from the Web, and generates an associative entity-relation social network from their extracted personal profiles. As a case study, we consider the task of peer recommendation at scientific conferences. Given a social graph of scholars, RSR employs graph similarity measures to rank conference participants by their relatedness to a user. Unlike other recommender systems that perform social rankings, RSR provides the user with detailed supporting explanations in the form of relational connecting paths. In a set of user studies, we collected feedback from participants on site at scientific conferences on the quality of RSR's recommendations and explanations. The feedback indicates that users appreciate and benefit from RSR's explainability features. It further points to recommendation serendipity: RSR recommends persons of interest who are not known to the user a priori, often exposing surprising interpersonal associations. Finally, we outline and assess potential gains in recommendation relevance and serendipity using path-based relational learning within RSR.

    A graph-search framework for associating gene identifiers with documents

    BACKGROUND: One step in the model organism database curation process is to find, for each article, the identifier of every gene discussed in the article. We consider a relaxation of this problem suitable for semi-automated systems, in which each article is associated with a ranked list of possible gene identifiers, and experimentally compare methods for solving this geneId-ranking problem. In addition to baseline approaches based on combining named entity recognition (NER) systems with a "soft dictionary" of gene synonyms, we evaluate a graph-based method which combines the outputs of multiple NER systems, as well as other sources of information, and a learning method for reranking the output of the graph-based method. RESULTS: We show that NER systems with similar F-measure performance can have significantly different performance when used with a soft dictionary for geneId-ranking. The graph-based approach can outperform any of its component NER systems, even without learning, and learning can further improve the performance of the graph-based ranking approach. CONCLUSION: The utility of an NER system for geneId-finding may not be accurately predicted by its entity-level F1 performance, the most common performance measure. GeneId-ranking systems are best implemented by combining several NER systems. With appropriate combination methods, usefully accurate geneId-ranking systems can be constructed based on easily available resources, without resorting to problem-specific, engineered components.

    Learning to rank typed graph walks: Local and global approaches

    We consider the setting of lazy random graph walks over directed graphs, where entities are represented as nodes and typed edges represent the relations between them. This framework has been used in a variety of problems to derive an extended measure of entity similarity. In this paper we contrast two different approaches for applying supervised learning in this framework to improve graph walk performance: a gradient descent algorithm that tunes the transition probabilities of the graph, and a reranking approach that uses features describing global properties of the traversed paths. An empirical evaluation on a set of tasks from the domain of personal information management and multiple corpora shows that reranking performance is usually superior to the local gradient descent algorithm, and that the methods often yield the best results when combined.
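
The lazy typed graph walk described in this abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything here is an assumption made for illustration: the function name `lazy_graph_walk`, the `stay_prob` parameter, the per-edge-type weights, and the toy email graph are all hypothetical. At each step the walker stays at its current node with probability `stay_prob`; otherwise it follows an outgoing edge chosen in proportion to the weight of that edge's type.

```python
from collections import defaultdict


def lazy_graph_walk(edges, type_weights, start, steps=3, stay_prob=0.5):
    """Return a probability distribution over nodes after `steps` lazy steps.

    edges:        list of (source, edge_type, target) triples (directed).
    type_weights: hypothetical mapping edge_type -> non-negative weight.
    stay_prob:    assumed probability of staying at the current node.
    """
    # Index outgoing edges by source node.
    outgoing = defaultdict(list)
    for src, etype, dst in edges:
        outgoing[src].append((etype, dst))

    dist = {start: 1.0}  # all probability mass starts at the query node
    for _ in range(steps):
        nxt = defaultdict(float)
        for node, mass in dist.items():
            out = outgoing.get(node, [])
            total = sum(type_weights.get(t, 0.0) for t, _ in out)
            if total == 0:
                # Dead end (or only zero-weight edges): keep all mass here.
                nxt[node] += mass
                continue
            # Lazy step: part of the mass stays put.
            nxt[node] += stay_prob * mass
            # The rest is spread over outgoing edges by edge-type weight.
            for etype, dst in out:
                share = type_weights.get(etype, 0.0) / total
                nxt[dst] += (1.0 - stay_prob) * mass * share
        dist = dict(nxt)
    return dist


# Toy graph: two messages and one person, linked by typed edges.
toy_edges = [
    ("msg1", "sent_by", "alice"),
    ("msg2", "sent_by", "alice"),
    ("alice", "sent", "msg1"),
    ("alice", "sent", "msg2"),
]
toy_weights = {"sent_by": 1.0, "sent": 1.0}

scores = lazy_graph_walk(toy_edges, toy_weights, "msg1", steps=3)
for node, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{node}: {score:.3f}")
```

Nodes that accumulate more probability mass are treated as more similar to the start node. In this sketch, tuning `type_weights` would loosely correspond to the "local" approach of adjusting transition probabilities, while reranking would operate on the returned scores using additional path features.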

    Algorithms, Experimentation

    Similarity measures for text have historically been an important tool for solving information retrieval problems. In many interesting settings, however, documents are structurally related to other documents, as well as other non-textual objects: for instance, email messages are connected to other messages via thread information embedded in the header. In this paper we consider extended similarity metrics for documents and other objects embedded in graphs, implemented via a lazy graph walk. We provide a detailed instantiation of this framework for email data, where content, social networks and a timeline are integrated in a structural graph. The suggested framework is evaluated for two email-related problems: disambiguating names in email documents, and threading. We show that reranking schemes based on the graph-walk similarity measures often outperform baseline methods, and that further improvements can be obtained by use of appropriate learning methods.

    Algorithms, Experimentation

    Similarity measures for text have historically been an important tool for solving information retrieval problems. In many interesting settings, however, documents are closely connected to other documents, as well as other non-textual objects: for instance, email messages are connected to other messages via header information. In this paper we consider extended similarity metrics for documents and other objects embedded in graphs, facilitated via a lazy graph walk. We provide a detailed instantiation of this framework for email data, where content, social networks and a timeline are integrated in a structural graph. The suggested framework is evaluated for two email-related problems: disambiguating names in email documents, and threading. We show that reranking schemes based on the graph-walk similarity measures often outperform baseline methods, and that further improvements can be obtained by use of appropriate learning methods.

    A Graphical Framework for Contextual Search and Name Disambiguation in Email

    Similarity measures for text have historically been an important tool for solving information retrieval problems. In this paper we consider extended similarity metrics for documents and other objects embedded in graphs, facilitated via a lazy graph walk. We provide a detailed instantiation of this framework for email data, where content, social networks and a timeline are integrated in a structural graph. The suggested framework is evaluated for the task of disambiguating names in email documents. We show that reranking schemes based on the graph-walk similarity measures often outperform baseline methods, and that further improvements can be obtained by use of appropriate learning methods.